This archive contains the data and scripts for replicating the findings reported in Rauh, Christian (2022) 'Clear messages to the European public? The language of European Commission press releases 1985-2020', forthcoming in the Journal of European Integration.
Among other things, this includes an original full-text corpus of all 44,978 press releases that the European Commission published between January 17, 1985, and January 8, 2021.

The replication package is wrapped in an RStudio project. If you open 'CommissionCommunication.Rproj' first, all relative paths in the scripts will work.
All analyses have been implemented in R version 4.0.3; additional packages and their version numbers are specified at the beginning of the respective script.
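
As a quick sanity check before running the scripts, the running R version can be compared against the one named above. This is only a minimal sketch and not part of the archive; the variable names are illustrative, and a different version does not necessarily mean the scripts will fail.

```r
# Illustrative check (not part of the replication scripts):
# compare the running R version with the version the analyses were run in.
expected_version <- "4.0.3"                      # taken from this README
running_version  <- as.character(getRversion())  # version of the current R session

if (running_version != expected_version) {
  message("Running R ", running_version, "; the analyses were implemented in R ",
          expected_version, ". Results may differ slightly.")
} else {
  message("R version matches the one used for the analyses.")
}
```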



Original text corpora (in "./Corpora"):

# EC-PressReleases_1985-2020_clean.RDS	# 44,978 press releases the European Commission has issued between 1985 and 2020
# IRE-GovPressReleases.Rds				# 6,718 press releases from the Irish government
# UK-GovPressReleases.Rds				# 105,367 press releases from the UK government
# PolSciAbstracts.rds					# 2,447 political science abstracts from five major journals
# BNC_RawTexts.rds						# 4,049 paragraphs sampled from British newspapers (drawn from the British National Corpus)
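
The corpora are stored as serialized R objects, so they can be read with base R's readRDS(). The following is a minimal sketch, assuming the working directory is the project root (as it is when the RStudio project is opened). The helper load_corpus() is hypothetical and not part of the archive, and the internal structure of the stored objects is not documented here, so the example only inspects the top level.

```r
# Hypothetical helper (not part of the archive): read one corpus file from
# the "Corpora" folder, returning NULL with a message if it is not found.
load_corpus <- function(file_name, corpus_dir = "Corpora") {
  path <- file.path(corpus_dir, file_name)
  if (!file.exists(path)) {
    message("File not found: ", path,
            " (is the working directory the project root?)")
    return(NULL)
  }
  readRDS(path)
}

# File name copied exactly from the list above; note that the extensions
# differ in case across files (.RDS, .Rds, .rds), which matters on
# case-sensitive file systems.
ec_prs <- load_corpus("EC-PressReleases_1985-2020_clean.RDS")

# Inspect only the top-level structure, since the object layout is an unknown here.
if (!is.null(ec_prs)) str(ec_prs, max.level = 1)
```

The same call should work for the other corpus files by swapping in their file names.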


Scripts (in root folder):

1_PRs-Output.R 						# Plot output: monthly number of Commission press releases over time (Figure 1 in the main text)
2_PRs-ClarityIndicators.R			# Extract language clarity indicators from COM, UK and IRE press releases
3_Comp-ClarityIndicators.R			# Extract language clarity indicators from PolSci abstracts and newspaper texts
4_PRs-DescriptivePlot.R				# Compare language clarity across corpora and time (Figure 2 in the main text)
4_PRs-DescriptivePlot_greyscale.R	# Same as above, output in greyscale (sigh ...)
5_PRs-TextMatching.R				# Estimate topic models and compare language clarity by topic-matched PRs (a.o. Figure 3 in the main text)

X_Eurojargon.R						# Compare word frequencies across Comm PRs and benchmark corpora (a.o., Figure A2 in the appendix)
X_SearchK.R							# Compare topic fit indicators across different values of k (Figure A3 in the appendix)

X_Scraper_Principles.R				# Basic setup of the web scraper for the online archive of European Commission press releases (headless browser phantomjs also provided)

